The molecular characteristics could supplement the staging system of pT2/T3N0M0 esophageal squamous cell carcinoma: a translational study based on a cohort with over 20 years of follow-up

Objective This study aimed to construct a model based on 23 enrolled molecules to evaluate prognoses of pT2/3N0M0 esophageal squamous cell carcinoma (ESCC) patients with up to 20 years of follow-up. Methods The lasso-Cox model was used to identify the candidate molecule. A nomogram was conducted to develop the survival model (molecular score, MS) based on the molecular features. Cox regression and Kaplan-Meier analysis were used in this study. The concordance index (C-index) was measured to compare the predicted ability between different models. The primary endpoint was overall survival (OS). Results A total of 226 patients and 23 proteins were enrolled in this study. Patients were classified into high-risk (MS-H) and low-risk (MS-L) groups based on the MS score of 227. The survival curves showed that the MS-L cohort had better 5-year and 10-year survival rates than the MS-H group (5-year OS: 51.0% vs. 8.0%; 10-year OS: 45.0% vs. 5.0%, all p < 0.001). Furthermore, multivariable analysis confirmed MS as an independent prognostic factor after eliminating the confounding factors (Hazard ratio 3.220, p < 0.001). The pT classification was confirmed to differentiate ESCC patients’ prognosis (Log-rank: p = 0.029). However, the combination of pT and MS could classify survival curves evidently (overall p < 0.001), which showed that the prognostic prediction efficiency was improved significantly by the combination of the pT and MS than by the classical pT classification (C-index: 0.656 vs. 0.539, p < 0.001). Conclusions Our study suggested an MS for significant clinical stratification of T2/3N0M0 ESCC patients to screen out subgroups with poor prognoses. Besides, the combination of pT staging and MS could predict survival more accurately for this cohort than the pT staging system alone. Supplementary Information The online version contains supplementary material available at 10.1186/s12935-024-03286-5.

The molecular characteristics could supplement the staging system of pT2/T3N0M0 esophageal squamous cell carcinoma: a translational study based on a cohort with over 20 years of follow-up Introduction Esophageal cancer (EC) is a relatively common malignant tumor of the digestive system in human beings, and according to statistics, there were 572,034 new cases (3.2%) and 508,585 deaths (5.3%) of EC globally in 2018 [1].Esophageal squamous carcinoma (ESCC), as one of the main pathological types, accounted for up to 90% of EC, with obvious geographic distribution characteristics [2].Traditional treatment of ESCC includes endoscopic esophageal mucosal dissection, surgical resection, and radiotherapy [3,4].With the recent exploration of the pathogenesis, the discovery of various biomarkers, and the application of immunotherapy, the life expectancy of EC patients has improved to a certain extent, but the overall 5-year survival rate remains between 15% and 25% [5].Tumor staging is crucial for the development of effective treatment strategies and the evaluation of prognosis.The American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) system is a universally adopted language for oncology, and it is widely used as a classical means of prognostic evaluation [6].However, EC is highly heterogeneous [7][8][9].Patients with the same TNM stage may have inconsistent clinical outcomes, and even the 5-year survival rate of patients with cT1N0 stage is worse than that of patients with cT1N+ [10].Therefore, there is an urgent need to develop a more accurate prognostic model to facilitate the clinical evaluation.
Molecular markers, as useful adjuncts to the TNM staging system, have been continuously explored and tested in recent years [9,11].With the widespread clinical use of RNA sequencing and second-generation sequencing, more and more esophageal squamous carcinomas have clear molecular profiles.However, there is still a lack of well-established and relatively well-recognized molecular markers to reveal the prognostic characteristics of patients.
Previous studies showed that the disruption of apoptosis and cell cycle is an important factor in tumor formation [12][13][14][15].The BCL2 family comprises key regulators of apoptosis, and the high expression of genes in the BCL2 family could promote the cell proliferation of ESCC [12,13,16].In addition, cell cycle-related proteins, such as Rb, p53, and PCNA, play important roles in regulating the cell proliferation [17,18].Therefore, we selected three apoptosis-related (Bcl-2, Bcl2-L-4, and Caspase-3) and five cell cycle-related (Rb, p53, p27 Kip1 , p16 INK4 , and PCNA) antibodies further to explore the prognostic significance of those proteins in ESCC.The WNT signaling pathway (c-Myc, BCL-1, and Catenin beta-1 were selected in this study) and the PI3K-AKT signaling pathway (SPP-1, erbB-2, and EGFR) were selected in this study) can regulate the proliferation and metastasis of esophageal cancer cells, thus implying that proteins on those signaling pathways have the potential to indicate the prognostic characteristics of patients [19,20].Previous reports suggested that proteins of pathways in cancer (such as MMP-2, MMP-9, and COX2) and protein family (such as TIMP-2, TIMP-1, and Ki-67) were highly differentially expressed on esophageal cancer and para-cancerous tissues [21][22][23].Accordingly, we utilized the proteins associated with the above pathways to explore the prognostic characteristics of the patients.
Most of the studies have been conducted with limited follow-up periods and diverse treatment modalities, thus interfering with the predictive power of molecular models.Therefore, in this study, the expression levels of 23 enrolled molecules were evaluated in pT2/T3N0M0 ESCC patients with up to 20 years of follow-up, in order to more accurately predict the prognosis and stratify the death risk of patient.

Patients selection
The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).Our study was composed of patients who underwent surgery for primary ESCC at Sun Yat-sen University Cancer Center between 1993 and 2003.The selection criteria of patients are presented in Fig. 1.A total of 226 patients were enrolled, and the TNM stage was identified according to the eighth edition TNM classification system.Patient's clinical and pathological data included age at surgery, sex, tumor location, surgery, pathological T (pT) stage, and tumor grade.
system of manufacturer's instructions (Glostrup, Dako, Denmark).We selected proteins from apoptosis pathway (Bcl-2, Bcl2-L-4, and Caspase-3), WNT signaling pathway (c-Myc, BCL-1, and Catenin beta-1), pathways in cancer (MMP-2, MMP-9, PTEN, Cadherin-1, and COX2), cell cycle (p16 INK4 , Rb, p53, PCNA, and p27 Kip1 ), PI3K-AKT signaling pathway (SPP-1, EGFR, and erbB-2), protein families (HSP70, Ki-67, TIMP-1, TIMP-2, Galectin-3, and p63), and others (ID-1 and CD44v6) to use in this study.The detailed information about 27 primary antibodies is presented in Supplementary Table 1.Ventana OmniMap anti-rabbit and anti-rat antibodies were used as secondary antibodies.Staining intensity and extent were scored as previously reported [9].The intensity of immunostaining was calculated as point 0 for no staining, 1 for very weak staining, 2 for moderate staining, and 3 for intense staining.The extent of immunoreactive tumor cell scoring was evaluated as follows: 0 for 0% positive cells, 1 for 1-10% positive cells, 2 for 11-25% positive cells, 3 for 26-40% positive cells, and 4 for over 41% positive cells.The final quantitation for each staining was obtained by multiplying the two points mentioned above.The results of immunohistochemical staining were interpreted independently by two pathologists under double-blind conditions who didn't know any clinical or other pathological information about patients.When the results were inconsistent, they would perform a joint discussion to decide the final results.

Statistical analysis
The cut-off values of protein expression and molecular scores were calculated by R version 4.1.1(https://www.rproject.org/).Besides, the Lasso-Cox, Kaplan-Meier, and nomogram analyses were conducted by R version 4.1.1.Other analyses were performed using the software SPSS 25.0 (IBM SPSS, Inc., Armonk, NY).The proportions of categorical outcomes were assessed by Pearson's Chisquare test or Fisher's exact test.Univariable and multivariable Cox proportional hazard models were adopted to calculate the hazard ratio (HR) and 95% confidence interval (CI).Predictors (p < 0.2) in univariable analysis and the known affecting-prognosis factors were brought into a multivariable analysis.The statistical significance was considered as a p < 0.05 on two sides.The concordance index (C-index) was measured to compare the predicted ability between different models.OS was analyzed by using the Kaplan-Meier method, and the differences were compared by log-rank tests.The primary endpoint was overall survival.

Patient characteristics
In this study, men outnumbered women, constituting 72.5% of the patients.A total of 175 (77.4%) patients were aged 65 and below, whereas 51 (22.6%) were over 65.The majority of patients were diagnosed with grade I or II ESCC, comprising more than 75% of the whole cohort.In terms of the pT classification, most patients were diagnosed with T3 ESCC.The number of patients with low molecular score (MS-L) (n = 105) was smaller than that of patients with high molecular score (MS-H) (n = 121).The baseline characteristics of the enrolled group are shown in Table 1.

Lasso-Cox regression analyses of proteins
The cutoff values of proteins were calculated by R software and recorded in Supplementary Table 2, by which the patients were divided into low-and high-expression groups.LASSO regression analysis was performed on the whole 27 proteins with prognostic value, and the risk coefficient of each protein was evaluated.Finally, twentythree prognostic proteins were retained according to the minimum partial likelihood of deviance (Fig. 2A, B).

The development of a nomogram
We used the 23 proteins to construct a nomogram (Fig. 2C).The point of every protein is listed in Table 2.Then, we set a MS based on the points of 23 proteins.According to the analysis of R software, the cutoff value of MS was identified as 227.The survival curves were drawn by Kaplan-Meier analysis and showed that the cohort with MS-L had better 5-year and 10-year survival rates than the MS-H group (5-year OS: 51.0% vs. 8.0%; 10-year OS: 45.0% vs. 5.0%, all p < 0.001, Fig. 3A).

Univariable and multivariable analyses
The outcomes of univariable and multivariable analyses are presented in Table 3.In order to discriminate the prognostic factors, a total of 6 variables were included in the univariable Cox regression analysis.MS (p < 0.001) and pT (p = 0.036) were considered to aggravate the survival.Furthermore, multivariable analysis confirmed MS as an independent prognostic factor after eliminating the confounding factors (HR = 3.220, p < 0.001).

Combination of pT and MS
The pT classification, as a classical indicator, was confirmed to be able to differentiate the prognosis of ESCC patients by Kaplan-Meier analysis (Log-rank: p = 0.029, Fig. 3B).However, the combination of pT and MS could classify survival curves evidently (overall p < 0.001, Fig. 4).Besides, Table 4 showed the univariable analysis based on the combination of the pT and MS.We calculated the C-index to compare the predictive ability between pT alone and the combination of the pT and MS, which showed that the prognostic prediction efficiency was improved significantly by the combination of the pT and MS (C-index = 0.656, SE = 0.02) than by the classical pT classification (C-index = 0.539, SE = 0.02, p < 0.001).

Discussion
In the present study, we collected the data of ESCC patients, including clinical, pathological, and protein expression information.We performed the Lasso-Cox regression, Cox regression, and nomogram to explore the predictive effect of protein expression in ESCC patients.Then, we constructed a molecular score based on the protein impacting prognosis.After adjusting for other confounders, the molecular score was identified as an independent factor, which could classify survival curves evidently.We combined the MS and TNM staging system to predict the survival of ESCC patients.Overall, the predictive ability was better in the combination of the MS and pT staging system than in the traditional pT staging system alone.Therefore, we suggest that the molecular characteristics could improve the prognostic predictive of ESCC patients.Besides, those results inform the inclusion of molecular features in the future TNM staging system of ESCC.
It's important to evaluate the molecular features of ESCC patients.Previous studies from basic experiments found that the expression of MMP2 could affect the migration, invasion, and metastasis of ESCC cells [24,25].A study from Li et al. reported that the expression of MMP2 in ESCC was significantly associated with tumor invasion depth and lymph node metastasis [26].Tumor invasion and lymph node metastasis imply high levels of T and N classification and a poor prognosis.However, there was no survival analysis in their research.
Besides, the abovementioned study had a small sample size (n = 58) but included complex TNM staging that encompassed stages I-III diseases, making it difficult to state that the same stage carrying different genetic features would have different prognostic characteristics.In the present study, we provided the prognostic data over 20 years and confirmed that T2/3N0M0 ESCC patients with the higher expression of MMP2 had poorer survival.
Recently, COX2 has become the hotspot in anti-cancer research, and it has been demonstrated that COX2 could play an important role in tumor promotion and the sensitivity of immunotherapy [27][28][29].High expression of COX2 can promote the proliferation, angiogenesis, invasiveness, metastasis, and also the inhibition of apoptosis and immuno-surveillance [27,30].A brief report found that a combination of acetaminophen (COX1/2 inhibitor) and anti-PD-1 treatment significantly reduced the treatment efficacy compared with anti-PD-1 treatment alone in the Lewis lung carcinoma mouse model [28].In our study, the high expression of COX2 indicated worse outcomes for ESCC patients.Therefore, these results suggest that molecular signatures not only provide an indication of a patient's prognosis but also serve as an indicator of therapeutic efficacy or as a potential target for therapy.
Adding molecular characterization to clinical data really improves the stratification of patient prognosis.In our previous study, the improvement in the accuracy of prognostic modeling with the addition of the assessment of PD-L1 expression levels was significant (C-index increased by 0.204) [9].Another study by us showed that the expression of FOXD1 could serve as an independent prognosticator for head and neck squamous cell carcinoma, which indicated that adding the expression of FOXD1 into the prognostic assessment tool, such as  TNM system, could improve the prognostic prediction [31].A report by Liu et al. found that CAV1 promoted glioma progression, and its high expression was related to the poor prognosis [32].Similarly, the expression of CAV1 also indicated the prognosis independently.A study by Tang et al. confirmed that the high expression of RRM2 enhanced esophageal cancer's radiotherapy resistance and shortened those patients' survival [33].Those studies mentioned above demonstrate the value of molecular profiling in the prognostic assessment of patients with malignancies.The nomogram, as an individualized scoring tool, is useful in clinical practice.The development and use of nomogram has improved cancer patients' prognostic assessment [34][35][36][37].In this study, we used both of these tools: molecular characterization and nomogram.Importantly, our follow-up is over 20 years, giving our model better prognostic prediction stability.This study still has some drawbacks.First, the sample size was relatively small, although all patients belonged to T2/3N0M0 diseases.Further large multicentric studies are required to research and verify our results.Second, Although the follow-up period was long, the recurrence of the patients was difficult to document, and therefore, it is not possible to give information on the role of molecular features on disease recurrence in this study.Third, we chose 27 proteins, but we did not explore the interrelationships between these proteins.Fourth, considering the inherent shortcomings of retrospective studies, there is still a need for prospective studies to further explore the significance that molecular characterization brings to patient prognostic assessment and therapeutic decision-making.

Conclusions
Our study suggested an MS for significant clinical stratification of T2/3N0M0 ESCC patients to screen out subgroups with poor prognoses.Besides, the combination of TNM staging and MS could predict survival more accurately for this cohort than the pT staging system alone.

Fig. 1
Fig. 1 The flowchart of patient selection

Fig. 2
Fig. 2 The procession of Lasso-Cox model (A and B).The nomogram based on selected proteins from the Lasso-Cox model (C)

Fig. 3
Fig. 3 The survival curves based on the molecular score (A) and the pT classification (B)

Fig. 4
Fig. 4 The survival curves based on the combination of molecular score and pT classification

Table 1
The baseline information of esophageal squamous cell carcinoma *This comparison was based on the Fisher's exact test.Other comparisons were based on the Chi-square test

Table 2
The point of every protein screened by Lasso-Cox analyses

Table 3
Univariable and multivariable analyses for the cohort

Table 4
Univariable analysis for the combination of pT and MS